Speeding Up MCMC by Efficient Data Subsampling
Similar resources
Speeding Up Relational Data Mining by Learning to
The motivation behind multi-relational data mining is knowledge discovery in relational databases containing multiple related tables. One difficulty relational data mining faces is managing intractably large hypothesis spaces. We attempt to overcome this difficulty by first sampling the hypothesis space. We generate a small set of hypotheses, uniformly sampled from the space of candidate hypotheses, and evaluate thi...
Speeding up correlation search for binary data
Finding the most interesting correlations in a collection of items is essential for problems in many commercial, medical, and scientific domains. Much previous research focuses on finding correlated pairs instead of correlated itemsets in which all items are correlated with each other. Though some existing methods find correlated itemsets of any size, they suffer from both efficiency and effect...
An Adaptive Subsampling Approach for MCMC Inference in Large Datasets
Markov chain Monte Carlo (MCMC) methods are often deemed far too computationally intensive to be of any practical use for large datasets. This paper describes a methodology that aims to scale up the Metropolis-Hastings (MH) algorithm in this context. We propose an approximate implementation of the accept/reject step of MH that only requires evaluating the likelihood of a random subset of the da...
Speeding up deciphering by hypergraph ordering
The “Gluing Algorithm” of Semaev [Des. Codes Cryptogr. 49 (2008), 47–60], which finds all solutions of a sparse system of linear equations over the Galois field GF(q), has average running time $O(m q^{|\cup_{1}^{k} X_j|})$, where $m$ is the total number of equations, and $\cup_{1}^{k} X_j$ is the set of all unknowns actively occurring in the first $k$ equations. Our goal here is to minimize the exponent of $q$ in the case whe...
Speeding Up FastICA by Mixture Random Pruning
We study and derive a method to speed up kurtosis-based FastICA in the presence of information redundancy, i.e., for large samples. It consists of randomly decimating the data set as much as possible while preserving the quality of the reconstructed signals. By performing an analysis of the kurtosis estimator, we find the maximum reduction rate which guarantees a narrow confidence interval of such ...
Journal
Journal title: Journal of the American Statistical Association
Year: 2018
ISSN: 0162-1459,1537-274X
DOI: 10.1080/01621459.2018.1448827